Load the necessary libraries
library(tidyverse) # for data wrangling etc
library(rstanarm) # for fitting models in STAN
library(cmdstanr) # for cmdstan
library(brms) # for fitting models in STAN
library(coda) # for diagnostics
library(bayesplot) # for diagnostics
library(ggmcmc) # for MCMC diagnostics
library(DHARMa) # for residual diagnostics
library(rstan) # for interfacing with STAN
library(emmeans) # for marginal means etc
library(broom) # for tidying outputs
library(tidybayes) # for more tidying outputs
library(ggeffects) # for partial plots
library(broom.mixed) # for summarising models
library(bayestestR) # for ROPE
library(see) # for some plots
theme_set(theme_grey()) # put the default ggplot theme back
source("helperFunctions.R")
Polis et al. (1998) were interested in modelling the presence/absence of lizards (Uta sp.) against the perimeter to area ratio of 19 islands in the Gulf of California.
Uta lizard
Format of polis.csv data file
| ISLAND | RATIO | PA |
|---|---|---|
| Bota | 15.41 | 1 |
| Cabeza | 5.63 | 1 |
| Cerraja | 25.92 | 1 |
| Coronadito | 15.17 | 0 |
| .. | .. | .. |
| Variable | Description |
|---|---|
| ISLAND | Categorical listing of the name of the 19 islands used - variable not used in analysis. |
| RATIO | Ratio of perimeter to area of the island. |
| PA | Presence (1) or absence (0) of Uta lizards on island. |
The aim of the analysis is to investigate the relationship between island perimeter to area ratio and the presence/absence of Uta lizards.
polis <- read_csv("../public/data/polis.csv", trim_ws = TRUE)
polis %>% glimpse()
## Rows: 19
## Columns: 3
## $ ISLAND <chr> "Bota", "Cabeza", "Cerraja", "Coronadito", "Flecha", "Gemelose"…
## $ RATIO <dbl> 15.41, 5.63, 25.92, 15.17, 13.04, 18.85, 30.95, 22.87, 12.01, 1…
## $ PA <dbl> 1, 1, 1, 0, 1, 0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0, 1, 1
The individual responses (\(y_i\), observed presence/absence of Uta lizards) are each expected to have been independently drawn from Bernoulli (binomial with \(n=1\)) distributions (\(\mathcal{Bin}\)). These distributions represent all the possible presences/absences we could have obtained at the specific (\(i^{th}\)) level of island perimeter to area ratio. Hence the \(i^{th}\) presence/absence observation is expected to have been drawn from a binomial distribution with a probability of \(\mu_i\) and a size of \(n=1\).
The expected probabilities are related to the linear predictor (intercept plus slope associated with perimeter to area ratio) via a logit link.
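To make this model statement concrete, here is a small simulation sketch in which presence/absence data are drawn from Bernoulli distributions whose probabilities are linked to a linear predictor via the inverse-logit (the coefficient values and ratios below are purely illustrative, not estimates from the data):

```r
set.seed(123)
ratio <- c(15.41, 5.63, 25.92, 15.17)           # example perimeter-to-area ratios
beta0 <- 3.6; beta1 <- -0.22                    # hypothetical intercept and slope
p <- plogis(beta0 + beta1 * ratio)              # inverse-logit: expected probabilities
y <- rbinom(length(ratio), size = 1, prob = p)  # Bernoulli (binomial, n = 1) draws
```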
We need to supply priors for each of the parameters to be estimated (\(\beta_0\) and \(\beta_1\)). Whilst we want these priors to be sufficiently vague as to not influence the outcomes of the analysis (and thus be equivalent to the frequentist analysis), we do not want the priors to be so vague (wide) that they permit the MCMC sampler to drift off into parameter space that is both illogical as well as numerically awkward.
As a starting point, let's assign the following priors:
Note, when fitting models through either rstanarm or brms, the priors assume that the predictor(s) have been centred and are to be applied on the link scale. In this case, the link is the logit.
Model formula: \[ \begin{align} y_i &\sim{} \mathcal{Bin}(n, p_i)\\ ln\left(\frac{p_i}{1-p_i}\right) &= \beta_0 + \beta_1 x_i\\ \beta_0 &\sim{} \mathcal{N}(0,2.5)\\ \beta_1 &\sim{} \mathcal{N}(0,1)\\ \end{align} \]
summary(glm(PA ~ RATIO, data = polis, family = binomial()))
##
## Call:
## glm(formula = PA ~ RATIO, family = binomial(), data = polis)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -1.6067 -0.6382 0.2368 0.4332 2.0986
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) 3.6061 1.6953 2.127 0.0334 *
## RATIO -0.2196 0.1005 -2.184 0.0289 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 26.287 on 18 degrees of freedom
## Residual deviance: 14.221 on 17 degrees of freedom
## AIC: 18.221
##
## Number of Fisher Scoring iterations: 6
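Since the estimates above are on the logit scale, it can help to back-transform them (an illustrative sketch using the coefficients reported above):

```r
# odds ratio: multiplicative change in the odds of presence per unit of RATIO
exp(-0.2196)                   # ~0.80, i.e. the odds decline by ~20% per unit
# predicted probability of presence for an island with RATIO = 15
plogis(3.6061 - 0.2196 * 15)   # ~0.58
```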
In rstanarm, the default priors are designed to be
weakly informative. They are chosen to provide moderate regularisation
(to help prevent over-fitting) and help stabilise the computations.
polis.rstanarm <- stan_glm(PA ~ RATIO,
data = polis,
family = binomial(),
iter = 5000, warmup = 1000,
chains = 3, thin = 5, refresh = 0
)
In the above:
- iter = 5000: the total number of MCMC iterations per chain
- warmup = 1000: the number of initial iterations discarded as warmup
- chains = 3: the number of independent MCMC chains
- thin = 5: retain only every 5th draw to reduce autocorrelation
- refresh = 0: suppress the progress output
Having allowed rstanarm to formulate its own weakly
informative priors, it is a good idea to explore what they are. Firstly,
out of curiosity, it might be interesting to see what it has chosen.
However, more importantly, we need to be able to document what the
priors were and the rstanarm development team make it very
clear that there is no guarantee that the default priors will remain the
same into the future.
prior_summary(polis.rstanarm)
## Priors for model 'polis.rstanarm'
## ------
## Intercept (after predictors centered)
## ~ normal(location = 0, scale = 2.5)
##
## Coefficients
## Specified prior:
## ~ normal(location = 0, scale = 2.5)
## Adjusted prior:
## ~ normal(location = 0, scale = 0.14)
## ------
## See help('prior_summary.stanreg') for more details
This tells us:
- rstanarm takes the view that if the predictors are centred (which they are by default), the mean of the response will be close to 0.5. On the logit scale, this equates to a value of 0. Hence, the prior on the intercept is defined as a normal prior with a mean of 0. For logistic regression, the response values must all be either 0 or 1, and on the logit scale values can range from -Inf to Inf. Nevertheless, values on the logit scale that exceed 2.5 in magnitude equate to values on the response scale that are either very high (if positive) or very low (if negative). rstanarm therefore uses a standard deviation of 2.5 for the normal prior on the intercept.
- for the slope, the specified prior (normal with a scale of 2.5) is adjusted by dividing its scale by the standard deviation of the predictor, yielding the adjusted scale of 0.14 reported above.

mean(polis$PA)
## [1] 0.5263158
2.5 * 1 / sd(polis$RATIO)
## [1] 0.1429651
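To see why a magnitude of 2.5 is considered extreme on the logit scale, we can back-transform it with the inverse-logit:

```r
plogis(2.5)    # ~0.92: very high probability of presence
plogis(-2.5)   # ~0.08: very low probability of presence
```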
One way to assess the priors is to have the MCMC sampler sample purely from the prior predictive distribution, without conditioning on the observed data. Doing so provides a glimpse at the range of predictions possible under the priors. On the one hand, wide-ranging predictions would ensure that the priors are unlikely to influence the actual predictions once they are conditioned on the data. On the other hand, if they are too wide, the sampler is being permitted to traverse regions of parameter space that are not logically possible in the context of the actual underlying ecological system. Not only does this mean that illogical parameter estimates are possible, but when the sampler traverses regions of parameter space that are not supported by the actual data, it can become unstable and have difficulty converging.
We can draw from the prior predictive distribution instead of
conditioning on the response, by updating the model and indicating
prior_PD=TRUE. After refitting the model in this way, we
can plot the predictions to gain insights into the range of predictions
supported by the priors alone.
polis.rstanarm1 <- update(polis.rstanarm, prior_PD = TRUE)
ggemmeans(polis.rstanarm1, ~RATIO) %>% plot(add.data = TRUE)
Conclusions:
The following link provides some guidance about defining priors. [https://github.com/stan-dev/stan/wiki/Prior-Choice-Recommendations]
When defining our own priors, we typically do not want them to be auto-scaled (hence autoscale = FALSE).
If we wanted to define our own priors that were less vague, yet still not likely to bias the outcomes, we could try the following priors (mainly plucked out of thin air):
- \(\beta_0\): \(\mathcal{N}(0, 2.5)\)
- \(\beta_1\): \(\mathcal{N}(0, 0.1)\) (in the spirit of 2.5 * sd(polis$PA)/sd(polis$RATIO))

I will also overlay the raw data for comparison.
polis.rstanarm2 <- stan_glm(PA ~ RATIO,
data = polis,
family = binomial(),
prior_intercept = normal(0, 2.5, autoscale = FALSE),
prior = normal(0, 0.1, autoscale = FALSE),
prior_PD = TRUE,
iter = 5000, warmup = 1000,
chains = 3, thin = 5, refresh = 0
)
ggemmeans(polis.rstanarm2, ~RATIO) %>%
plot(add.data = TRUE)
Now let's refit, conditioning on the data.
polis.rstanarm3 <- update(polis.rstanarm2, prior_PD = FALSE)
posterior_vs_prior(polis.rstanarm3,
color_by = "vs", group_by = TRUE,
facet_args = list(scales = "free_y")
)
Conclusions:
ggemmeans(polis.rstanarm3, ~RATIO) %>% plot(add.data = TRUE)
ggemmeans(polis.rstanarm3, terms = "RATIO[0:63]") %>%
plot(
add.data = TRUE,
residuals = TRUE,
jitter = FALSE
)
# OR
polis.rstanarm3 %>%
ggpredict(~RATIO) %>%
plot(add.data = TRUE)
In brms, the default priors are designed to be weakly informative. They are chosen to provide moderate regularisation (to help prevent over-fitting) and to help stabilise the computations.
Unlike rstanarm, brms models must be compiled before they start sampling. For most models, compilation of the Stan code takes around 45 seconds.
polis.brm <- brm(bf(PA | trials(1) ~ RATIO, family = binomial()),
data = polis,
iter = 5000,
warmup = 1000,
chains = 3, cores = 3,
thin = 5,
refresh = 0,
backend = "cmdstan"
)
## Running MCMC with 3 parallel chains...
##
## Chain 1 finished in 0.1 seconds.
## Chain 2 finished in 0.1 seconds.
## Chain 3 finished in 0.1 seconds.
##
## All 3 chains finished successfully.
## Mean chain execution time: 0.1 seconds.
## Total execution time: 0.3 seconds.
In the above:
- brms can define models that are not possible with most other routines. To facilitate this enhanced functionality, we usually define a brms formula within its own bf() function along with the family (in this case binomial, which is not the default and must therefore be specified).

Having allowed brms to formulate its own weakly informative priors, it is a good idea to explore what they are. Firstly, out of curiosity, it might be interesting to see what it has chosen. However, more importantly, we need to be able to document what the priors were, and the brms development team make it very clear that there is no guarantee that the default priors will remain the same into the future.
polis.brm %>% prior_summary()
## prior class coef group resp dpar nlpar lb ub source
## (flat) b default
## (flat) b RATIO (vectorized)
## student_t(3, 0, 2.5) Intercept default
This tells us:
- for the intercept, it is using a Student t prior (a flatter, heavier-tailed normal) with 3 degrees of freedom, a mean of 0 and a scale of 2.5. These are the defaults used for a binomial model. The mean of the response is approximately 0.53. All priors are applied on the link scale (which for logistic regression is the logit by default). brms takes the view that if the predictors are centred (which they are by default), the mean of the response will be close to 0.5. On the logit scale, this equates to a value of 0, hence the prior on the intercept is centred on 0. Although values on the logit scale can range from -Inf to Inf, values exceeding 2.5 in magnitude equate to values on the response scale that are either very high (if positive) or very low (if negative). brms therefore uses a scale of 2.5 for the Student t prior on the intercept.
- for the beta coefficients (in this case, just the slope), the default prior is an improper flat prior. A flat prior essentially means that any value between negative infinity and positive infinity is equally likely. Whilst this might seem reckless, in practice it seems to work reasonably well for non-intercept beta parameters.
there are no distributional parameters in a binomial model and therefore there are no additional priors.
One way to assess the priors is to have the MCMC sampler sample purely from the prior predictive distribution, without conditioning on the observed data. Doing so provides a glimpse at the range of predictions possible under the priors. On the one hand, wide-ranging predictions would ensure that the priors are unlikely to influence the actual predictions once they are conditioned on the data. On the other hand, if they are too wide, the sampler is being permitted to traverse regions of parameter space that are not logically possible in the context of the actual underlying ecological system. Not only does this mean that illogical parameter estimates are possible, but when the sampler traverses regions of parameter space that are not supported by the actual data, it can become unstable and have difficulty converging.
In brms, we can inform the sampler to draw from the
prior predictive distribution instead of conditioning on the response,
by running the model with the sample_prior='only' argument.
Unfortunately, this cannot be applied when there are flat priors (since
the posteriors will necessarily extend to negative and positive
infinity). Therefore, in order to use this useful routine, we need to
make sure that we have defined a proper prior for all parameters.
In this case, we will define alternative priors for the slope. Default priors for the intercept are already provided (should we wish to use them).
priors <- prior(normal(0, 2.5), class = "Intercept") +
prior(normal(0, 1), class = "b")
polis.brm1 <- brm(bf(PA | trials(1) ~ RATIO, family = binomial()),
data = polis,
prior = priors,
sample_prior = "only",
iter = 5000,
warmup = 1000,
chains = 3, cores = 3,
thin = 5,
refresh = 0,
backend = "cmdstan"
)
## Running MCMC with 3 parallel chains...
##
## Chain 1 finished in 0.0 seconds.
## Chain 2 finished in 0.0 seconds.
## Chain 3 finished in 0.0 seconds.
##
## All 3 chains finished successfully.
## Mean chain execution time: 0.0 seconds.
## Total execution time: 0.2 seconds.
The following are unlikely to be that useful, as the predictions must be bounded to the range [0, 1]. Hence, it will be difficult to assess whether the priors could result in very large or small predictions.
polis.brm1 %>%
ggemmeans(~RATIO) %>%
plot(add.data = TRUE)
polis.brm1 %>%
conditional_effects() %>%
plot(points = TRUE)
Conclusions:
The following link provides some guidance about defining priors. [https://github.com/stan-dev/stan/wiki/Prior-Choice-Recommendations]
When defining our own priors, we typically do not want them to be scaled.
If we wanted to define our own priors that were less vague, yet still not likely to bias the outcomes, we could try the following priors (mainly plucked out of thin air):
- \(\beta_0\): \(\mathcal{N}(0, 2.5)\)
- \(\beta_1\): \(\mathcal{N}(0, 0.1)\) (in the spirit of 2.5 * sd(polis$PA)/sd(polis$RATIO))

I will also overlay the raw data for comparison.
standist::visualize("normal(0, 2.5)", xlim = c(-10, 100))
priors <- prior(normal(0, 2.5), class = "Intercept") +
prior(normal(0, 0.1), class = "b")
polis.brm2 <- brm(bf(PA | trials(1) ~ RATIO, family = binomial()),
data = polis,
prior = priors,
sample_prior = "only",
iter = 5000,
warmup = 1000,
chains = 3, cores = 3,
thin = 5,
refresh = 0,
backend = "cmdstan"
)
## Running MCMC with 3 parallel chains...
##
## Chain 1 finished in 0.0 seconds.
## Chain 2 finished in 0.0 seconds.
## Chain 3 finished in 0.0 seconds.
##
## All 3 chains finished successfully.
## Mean chain execution time: 0.0 seconds.
## Total execution time: 0.2 seconds.
ggemmeans(polis.brm2, ~RATIO) %>%
plot(add.data = TRUE)
polis.brm3 <- polis.brm2 %>% update(sample_prior = "yes", refresh = 0)
## Running MCMC with 3 sequential chains...
##
## Chain 1 finished in 0.1 seconds.
## Chain 2 finished in 0.1 seconds.
## Chain 3 finished in 0.1 seconds.
##
## All 3 chains finished successfully.
## Mean chain execution time: 0.1 seconds.
## Total execution time: 0.4 seconds.
ggemmeans(polis.brm3, ~RATIO) %>%
plot(add.data = TRUE)
ggemmeans(polis.brm3, terms = "RATIO[0:63]") %>%
plot(add.data = TRUE)
polis.brm3 %>% conditional_effects()
polis.brm3 %>% get_variables()
## [1] "b_Intercept" "b_RATIO" "prior_Intercept" "prior_b"
## [5] "lprior" "lp__" "accept_stat__" "treedepth__"
## [9] "stepsize__" "divergent__" "n_leapfrog__" "energy__"
## polis.brm3 %>% hypothesis('Intercept=0', class='b') %>% plot
polis.brm3 %>%
hypothesis("RATIO=0") %>%
plot()
polis.brm3 %>% get_variables()
## [1] "b_Intercept" "b_RATIO" "prior_Intercept" "prior_b"
## [5] "lprior" "lp__" "accept_stat__" "treedepth__"
## [9] "stepsize__" "divergent__" "n_leapfrog__" "energy__"
polis.brm3 %>% SUYR_prior_and_posterior()
polis.brm3 %>% standata()
## $N
## [1] 19
##
## $Y
## [1] 1 1 1 0 1 0 0 0 0 1 1 1 1 0 0 0 0 1 1
##
## $trials
## [1] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
##
## $K
## [1] 2
##
## $X
## Intercept RATIO
## 1 1 15.41
## 2 1 5.63
## 3 1 25.92
## 4 1 15.17
## 5 1 13.04
## 6 1 18.85
## 7 1 30.95
## 8 1 22.87
## 9 1 12.01
## 10 1 11.60
## 11 1 6.09
## 12 1 2.28
## 13 1 4.05
## 14 1 59.94
## 15 1 63.16
## 16 1 22.76
## 17 1 23.54
## 18 1 0.21
## 19 1 2.55
## attr(,"assign")
## [1] 0 1
##
## $prior_only
## [1] 0
##
## attr(,"class")
## [1] "standata" "list"
polis.brm3 %>% stancode()
## // generated with brms 2.18.0
## functions {
##
## }
## data {
## int<lower=1> N; // total number of observations
## array[N] int Y; // response variable
## array[N] int trials; // number of trials
## int<lower=1> K; // number of population-level effects
## matrix[N, K] X; // population-level design matrix
## int prior_only; // should the likelihood be ignored?
## }
## transformed data {
## int Kc = K - 1;
## matrix[N, Kc] Xc; // centered version of X without an intercept
## vector[Kc] means_X; // column means of X before centering
## for (i in 2 : K) {
## means_X[i - 1] = mean(X[ : , i]);
## Xc[ : , i - 1] = X[ : , i] - means_X[i - 1];
## }
## }
## parameters {
## vector[Kc] b; // population-level effects
## real Intercept; // temporary intercept for centered predictors
## }
## transformed parameters {
## real lprior = 0; // prior contributions to the log posterior
## lprior += normal_lpdf(b | 0, 0.1);
## lprior += normal_lpdf(Intercept | 0, 2.5);
## }
## model {
## // likelihood including constants
## if (!prior_only) {
## // initialize linear predictor term
## vector[N] mu = rep_vector(0.0, N);
## mu += Intercept + Xc * b;
## target += binomial_logit_lpmf(Y | trials, mu);
## }
## // priors including constants
## target += lprior;
## }
## generated quantities {
## // actual population-level intercept
## real b_Intercept = Intercept - dot_product(means_X, b);
## // additionally sample draws from priors
## real prior_b = normal_rng(0, 0.1);
## real prior_Intercept = normal_rng(0, 2.5);
## }
library(INLA)
In INLA, the default priors are designed to be diffuse
or weak. They are chosen to provide moderate regularisation (to help
prevent over-fitting) and help stabilise the computations.
polis.inla <- inla(PA ~ RATIO,
data = polis,
family = "gaussian",
control.compute = list(config = TRUE, dic = TRUE, waic = TRUE, cpo = TRUE)
)
In the above:
- the formula, data and family arguments should be familiar, as they are the same as for other models in R.
- control.compute: allows us to indicate what additional actions should be performed during the model fitting. In this case, we have indicated:
  - dic: Deviance Information Criterion
  - waic: Watanabe-Akaike Information Criterion
  - cpo: out-of-sample estimates (measures of fit)
  - config: return the full configuration, to allow drawing from the posterior.

polis.inla %>% names()
## [1] "names.fixed" "summary.fixed"
## [3] "marginals.fixed" "summary.lincomb"
## [5] "marginals.lincomb" "size.lincomb"
## [7] "summary.lincomb.derived" "marginals.lincomb.derived"
## [9] "size.lincomb.derived" "mlik"
## [11] "cpo" "po"
## [13] "waic" "model.random"
## [15] "summary.random" "marginals.random"
## [17] "size.random" "summary.linear.predictor"
## [19] "marginals.linear.predictor" "summary.fitted.values"
## [21] "marginals.fitted.values" "size.linear.predictor"
## [23] "summary.hyperpar" "marginals.hyperpar"
## [25] "internal.summary.hyperpar" "internal.marginals.hyperpar"
## [27] "offset.linear.predictor" "model.spde2.blc"
## [29] "summary.spde2.blc" "marginals.spde2.blc"
## [31] "size.spde2.blc" "model.spde3.blc"
## [33] "summary.spde3.blc" "marginals.spde3.blc"
## [35] "size.spde3.blc" "logfile"
## [37] "misc" "dic"
## [39] "mode" "neffp"
## [41] "joint.hyper" "nhyper"
## [43] "version" "Q"
## [45] "graph" "ok"
## [47] "cpu.used" "all.hyper"
## [49] ".args" "call"
## [51] "model.matrix"
Having allowed INLA to formulate its own “minimally
informative” priors, it is a good idea to explore what they are.
Firstly, out of curiosity, it might be interesting to see what it has
chosen. However, more importantly, we need to be able to document what
the priors were and the INLA development team make it very
clear that there is no guarantee that the default priors will remain the
same into the future.
In calculating the posterior mode of the hyperparameters, it is efficient to maximise the sum of the log-likelihood and the log-prior; hence, priors are defined on a log scale. The canonical prior for a variance is the gamma prior, hence in INLA this is a loggamma.
Priors are also defined in terms of the mean and precision (the inverse of the variance, rather than the variance itself). Precision is \(1/\sigma^2\).
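As a quick orientation, converting between standard deviation and precision (a sketch only):

```r
sd_to_prec <- function(sigma) 1 / sigma^2    # precision = 1 / variance
prec_to_sd <- function(tau) 1 / sqrt(tau)    # sd = 1 / sqrt(precision)
sd_to_prec(2)       # 0.25
prec_to_sd(0.001)   # ~31.6
```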
To explore the default priors used by INLA, we can issue
the following on the fitted model:
inla.priors.used(polis.inla)
## section=[family]
## tag=[INLA.Data1] component=[gaussian]
## theta1:
## parameter=[log precision]
## prior=[loggamma]
## param=[1e+00, 5e-05]
## section=[fixed]
## tag=[(Intercept)] component=[(Intercept)]
## beta:
## parameter=[(Intercept)]
## prior=[normal]
## param=[0, 0]
## tag=[RATIO] component=[RATIO]
## beta:
## parameter=[RATIO]
## prior=[normal]
## param=[0.000, 0.001]
The above indicates:
standist::visualize("gamma(1, 0.00005)", xlim=c(-100,100000))
standist::visualize("normal(0, 31)", xlim=c(-100,100))
Family variance
Non-existent: the binomial family has no separate variance parameter, so no additional prior is required.
Intercept
The default prior on the intercept is a Gaussian with a mean of 0 and a precision of 0 (and thus effectively a flat uniform). Alternatively, we could define priors on the intercept that are more realistic. For example, we know that the middle probability of Uta lizard presence is likely to be close to 0.5 (here, 0.5263158). However, the parameters are all on a logit scale, hence a sensible prior mean would be log(0.5/(1-0.5)) = 0 (if we were to centre RATIO).
We also know that the probability cannot extend above 1 or below 0, that is, no more than 0.5 either side of the centre. On the logit scale, this is equivalent to approximately plus or minus 1.
mean(polis$PA)
## [1] 0.5263158
standist::visualize("normal(0, 1)", xlim=c(-2,2))
We could use these values as the basis for weakly informative priors on the intercept. Note, as INLA priors are expressed in terms of precision rather than variance, an equivalent prior would be \(\sim{} \mathcal{N}(0, 1)\) (i.e. a precision of \(1/(1)^2 = 1\)).
Fixed effects
The prior for the fixed effects (in this case, just the slope) is a Gaussian (normal) distribution with a mean of 0 and a precision of 0.001. This implies that the prior for the slope has a standard deviation of approximately 31.6 (since \(\sigma = \frac{1}{\sqrt{\tau}}\)). As a general rule, three standard deviations envelope most of a distribution, and thus this prior defines a distribution whose density is almost entirely within the range [-95, 95]. On a logit scale, this is very large.
In order to generate realistic informative Gaussian priors (for the purpose of constraining the posterior to a logical range) for fixed parameters, the following formulae are useful:
\[ \begin{align} \mu &= \frac{z_2\theta_1 - z_1\theta_2}{z_2-z_1}\\ \sigma &= \frac{\theta_2 - \theta_1}{z_2-z_1} \end{align} \]
where \(\theta_1\) and \(\theta_2\) are the quantiles on the scale of the parameter and \(z_1\) and \(z_2\) are the corresponding quantiles on the standard normal scale. Hence, if we considered that the slope is likely to be in the range of [-2, 2] (which, on the logit scale, corresponds to odds ratios per one unit change in RATIO of between roughly 0.14 and 7.4), we could take \(\theta_1 = -2\) at \(z_1 = qnorm(0.025, 0, 1) \approx -1.96\) and \(\theta_2 = 2\) at \(z_2 = qnorm(0.975, 0, 1) \approx 1.96\). This yields a normal prior with a mean of \(\mu = \frac{(1.96 \times -2) - (-1.96 \times 2)}{1.96 - (-1.96)} = 0\) and a standard deviation of \(\sigma = \frac{2 - (-2)}{1.96 - (-1.96)} \approx 1.02\). Rounding this up to \(\sigma = 2\) for a little extra latitude, the associated prior in INLA (which defines priors in terms of precision rather than standard deviation) would be \(\beta \sim{} \mathcal{N}(0, 0.25)\).
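A sketch of this quantile-matching calculation in R (the [-2, 2] slope range is the illustrative assumption above):

```r
theta <- c(-2, 2)             # assumed 2.5% and 97.5% quantiles for the slope
z <- qnorm(c(0.025, 0.975))   # corresponding standard-normal quantiles
mu <- (z[2] * theta[1] - z[1] * theta[2]) / (z[2] - z[1])   # prior mean
sigma <- (theta[2] - theta[1]) / (z[2] - z[1])              # prior sd
c(mu = mu, sigma = sigma, precision = 1 / sigma^2)
# mu = 0, sigma ~ 1.02, precision ~ 0.96
```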
standist::visualize("normal(0, 2)", xlim=c(-3,3))
In order to define each of the above priors, we could modify the
inla call:
polis.inla1 <- inla(PA ~ scale(RATIO, scale = FALSE),
data = polis,
Ntrials = 1,
family = "binomial",
control.fixed = list(
mean.intercept = 0,
prec.intercept = 0.01,
mean = 0,
prec = 0.25
),
control.compute = list(config = TRUE, dic = TRUE, waic = TRUE, cpo = TRUE)
)
In addition to the regular model diagnostics checking, for Bayesian analyses, it is also necessary to explore the MCMC sampling diagnostics to be sure that the chains are well mixed and have converged on a stable posterior.
There are a wide variety of tests that range from the big picture, overall chain characteristics to the very specific detailed tests that allow the experienced modeller to drill down to the very fine details of the chain behaviour. Furthermore, there are a multitude of packages and approaches for exploring these diagnostics.
The bayesplot package offers a range of MCMC diagnostics as well as posterior predictive checks (PPC), all of which have a convenient plot() interface. Let's start with the MCMC diagnostics.
available_mcmc()
## bayesplot MCMC module:
## mcmc_acf
## mcmc_acf_bar
## mcmc_areas
## mcmc_areas_ridges
## mcmc_combo
## mcmc_dens
## mcmc_dens_chains
## mcmc_dens_overlay
## mcmc_hex
## mcmc_hist
## mcmc_hist_by_chain
## mcmc_intervals
## mcmc_neff
## mcmc_neff_hist
## mcmc_nuts_acceptance
## mcmc_nuts_divergence
## mcmc_nuts_energy
## mcmc_nuts_stepsize
## mcmc_nuts_treedepth
## mcmc_pairs
## mcmc_parcoord
## mcmc_rank_hist
## mcmc_rank_overlay
## mcmc_recover_hist
## mcmc_recover_intervals
## mcmc_recover_scatter
## mcmc_rhat
## mcmc_rhat_hist
## mcmc_scatter
## mcmc_trace
## mcmc_trace_highlight
## mcmc_violin
Of these, we will focus on:
plot(polis.rstanarm3, plotfun = "mcmc_trace")
The chains appear well mixed and very similar
plot(polis.rstanarm3, "acf_bar")
There is no evidence of auto-correlation in the MCMC samples
plot(polis.rstanarm3, "rhat_hist")
All Rhat values are below 1.05, suggesting the chains have converged.
neff (number of effective samples): the ratio of the number of effective samples (those not rejected by the sampler) to the number of samples provides an indication of the effectiveness (and efficiency) of the MCMC sampler. Ratios that are less than 0.5 for a parameter suggest that the sampler spent considerable time in difficult areas of the sampling domain and rejected more than half of the samples (replacing them with the previous effective sample).
If the ratios are low, tightening the priors may help.
plot(polis.rstanarm3, "neff_hist")
Ratios all very high.
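The same checks can also be made numerically via the rhat() and neff_ratio() generics that bayesplot provides for fitted stanreg models (a sketch, assuming polis.rstanarm3 is in the workspace):

```r
rhat(polis.rstanarm3)        # values close to 1 (< 1.05) indicate convergence
neff_ratio(polis.rstanarm3)  # ratios well above 0.5 indicate efficient sampling
```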
plot(polis.rstanarm3, "combo")
plot(polis.rstanarm3, "violin")
The rstan package offers a range of MCMC diagnostics. Let's work through them.
Of these, we will focus on:
stan_trace(polis.rstanarm3)
The chains appear well mixed and very similar
stan_ac(polis.rstanarm3)
There is no evidence of auto-correlation in the MCMC samples
stan_rhat(polis.rstanarm3)
All Rhat values are below 1.05, suggesting the chains have converged.
stan_ess (number of effective samples): the ratio of the number of effective samples (those not rejected by the sampler) to the number of samples provides an indication of the effectiveness (and efficiency) of the MCMC sampler. Ratios that are less than 0.5 for a parameter suggest that the sampler spent considerable time in difficult areas of the sampling domain and rejected more than half of the samples (replacing them with the previous effective sample).
If the ratios are low, tightening the priors may help.
stan_ess(polis.rstanarm3)
Ratios all very high.
stan_dens(polis.rstanarm3, separate_chains = TRUE)
The ggmcmc package also has a set of MCMC diagnostic functions. Let's start with the MCMC diagnostics.
Of these, we will focus on:
polis.ggs <- ggs(polis.rstanarm3)
ggs_traceplot(polis.ggs)
The chains appear well mixed and very similar
ggs_autocorrelation(polis.ggs)
There is no evidence of autocorrelation in the MCMC samples
ggs_Rhat(polis.ggs)
All Rhat values are below 1.05, suggesting the chains have converged.
ggs_effective (number of effective samples): the ratio of the number of effective samples (those not rejected by the sampler) to the number of samples provides an indication of the effectiveness (and efficiency) of the MCMC sampler. Ratios that are less than 0.5 for a parameter suggest that the sampler spent considerable time in difficult areas of the sampling domain and rejected more than half of the samples (replacing them with the previous effective sample).
If the ratios are low, tightening the priors may help.
ggs_effective(polis.ggs)
Ratios all very high.
ggs_crosscorrelation(polis.ggs)
ggs_grb(polis.ggs)
The bayesplot package offers a range of MCMC diagnostics as well as posterior predictive checks (PPC), all of which have a convenient plot() interface. Let's start with the MCMC diagnostics.
available_mcmc()
## bayesplot MCMC module:
## mcmc_acf
## mcmc_acf_bar
## mcmc_areas
## mcmc_areas_ridges
## mcmc_combo
## mcmc_dens
## mcmc_dens_chains
## mcmc_dens_overlay
## mcmc_hex
## mcmc_hist
## mcmc_hist_by_chain
## mcmc_intervals
## mcmc_neff
## mcmc_neff_hist
## mcmc_nuts_acceptance
## mcmc_nuts_divergence
## mcmc_nuts_energy
## mcmc_nuts_stepsize
## mcmc_nuts_treedepth
## mcmc_pairs
## mcmc_parcoord
## mcmc_rank_hist
## mcmc_rank_overlay
## mcmc_recover_hist
## mcmc_recover_intervals
## mcmc_recover_scatter
## mcmc_rhat
## mcmc_rhat_hist
## mcmc_scatter
## mcmc_trace
## mcmc_trace_highlight
## mcmc_violin
Of these, we will focus on:
polis.brm3 %>% mcmc_plot(type = "trace")
The chains appear well mixed and very similar
polis.brm3 %>% mcmc_plot(type = "acf_bar")
There is no evidence of autocorrelation in the MCMC samples
polis.brm3 %>% mcmc_plot(type = "rhat_hist")
All Rhat values are below 1.05, suggesting the chains have converged.
neff_hist (number of effective samples): the ratio of the number of effective samples (those not rejected by the sampler) to the number of samples provides an indication of the effectiveness (and efficiency) of the MCMC sampler. Ratios that are less than 0.5 for a parameter suggest that the sampler spent considerable time in difficult areas of the sampling domain and rejected more than half of the samples (replacing them with the previous effective sample).
If the ratios are low, tightening the priors may help.
polis.brm3 %>% mcmc_plot(type = "neff_hist")
Ratios all very high.
polis.brm3 %>% mcmc_plot(type = "combo")
polis.brm3 %>% mcmc_plot(type = "violin")
The rstan package offers a range of MCMC diagnostics. Let's work through them.
Of these, we will focus on:
polis.brm3$fit %>% stan_trace()
The chains appear well mixed and very similar
polis.brm3$fit %>% stan_ac()
There is no evidence of autocorrelation in the MCMC samples
polis.brm3$fit %>% stan_rhat()
All Rhat values are below 1.05, suggesting the chains have converged.
stan_ess (number of effective samples): the ratio of the number of effective samples (those not rejected by the sampler) to the number of samples provides an indication of the effectiveness (and efficiency) of the MCMC sampler. Ratios that are less than 0.5 for a parameter suggest that the sampler spent considerable time in difficult areas of the sampling domain and rejected more than half of the samples (replacing them with the previous effective sample).
If the ratios are low, tightening the priors may help.
polis.brm3$fit %>% stan_ess()
Ratios all very high.
stan_dens(polis.brm3$fit, separate_chains = TRUE)
The ggmcmc package also has a set of MCMC diagnostic functions. Let's start with the MCMC diagnostics.
Of these, we will focus on:
polis.ggs <- polis.brm3 %>% ggs(inc_warmup = FALSE, burnin = FALSE)
polis.ggs %>% ggs_traceplot()
The chains appear well mixed and very similar
polis.ggs %>% ggs_autocorrelation()
There is no evidence of autocorrelation in the MCMC samples
polis.ggs %>% ggs_Rhat()
All Rhat values are below 1.05, suggesting the chains have converged.
ggs_effective (number of effective samples): the ratio of the number of effective samples (those not rejected by the sampler) to the number of samples provides an indication of the effectiveness (and efficiency) of the MCMC sampler. Ratios that are less than 0.5 for a parameter suggest that the sampler spent considerable time in difficult areas of the sampling domain and rejected more than half of the samples (replacing them with the previous effective sample).
If the ratios are low, tightening the priors may help.
polis.ggs %>% ggs_effective()
Ratios all very high.
polis.ggs %>% ggs_crosscorrelation()
polis.ggs %>% ggs_grb()
Posterior predictive checks provide additional diagnostics about the fit of the model. Specifically, they provide a comparison between predictions drawn from the model and the observed data used to train the model.
available_ppc()
## bayesplot PPC module:
## ppc_bars
## ppc_bars_grouped
## ppc_boxplot
## ppc_dens
## ppc_dens_overlay
## ppc_dens_overlay_grouped
## ppc_ecdf_overlay
## ppc_ecdf_overlay_grouped
## ppc_error_binned
## ppc_error_hist
## ppc_error_hist_grouped
## ppc_error_scatter
## ppc_error_scatter_avg
## ppc_error_scatter_avg_grouped
## ppc_error_scatter_avg_vs_x
## ppc_freqpoly
## ppc_freqpoly_grouped
## ppc_hist
## ppc_intervals
## ppc_intervals_grouped
## ppc_km_overlay
## ppc_km_overlay_grouped
## ppc_loo_intervals
## ppc_loo_pit
## ppc_loo_pit_overlay
## ppc_loo_pit_qq
## ppc_loo_ribbon
## ppc_ribbon
## ppc_ribbon_grouped
## ppc_rootogram
## ppc_scatter
## ppc_scatter_avg
## ppc_scatter_avg_grouped
## ppc_stat
## ppc_stat_2d
## ppc_stat_freqpoly
## ppc_stat_freqpoly_grouped
## ppc_stat_grouped
## ppc_violin_grouped
pp_check(polis.rstanarm3, plotfun = "dens_overlay")
The model draws appear to be consistent with the observed data.
pp_check(polis.rstanarm3, plotfun = "error_scatter_avg")
For a binary response, this plot is not readily interpretable.
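For a Bernoulli response, the binned error plot (ppc_error_binned in the list above) is usually more informative than raw error scatters. A sketch, assuming the same fitted polis.rstanarm3 object:

```r
## Binned average prediction errors -- better suited to 0/1 responses
## than the raw error scatter above
pp_check(polis.rstanarm3, plotfun = "error_binned")
```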
pp_check(polis.rstanarm3, x = polis$RATIO, plotfun = "error_scatter_avg_vs_x")
pp_check(polis.rstanarm3, x = polis$RATIO, plotfun = "intervals")
The modelled predictions are mostly consistent with the observed data. There is really only one exception.
pp_check(polis.rstanarm3, x = polis$RATIO, plotfun = "ribbon")
The shinystan package allows the full suite of MCMC
diagnostics and posterior predictive checks to be accessed via a web
interface.
# library(shinystan)
# launch_shinystan(polis.rstanarm3)
DHARMa residuals provide very useful diagnostics. Unfortunately, we
cannot directly use the simulateResiduals() function to
generate the simulated residuals. However, if we are willing to
calculate some of the components ourselves, we can still obtain the
simulated residuals from the fitted stan model.
We need to supply: the simulated (posterior predicted) responses, the observed responses, and the fitted (predicted) response for each observation.
preds <- posterior_predict(polis.rstanarm3, ndraws = 250, summary = FALSE)
polis.resids <- createDHARMa(
simulatedResponse = t(preds),
observedResponse = polis$PA,
fittedPredictedResponse = apply(preds, 2, median),
integerResponse = TRUE
)
plot(polis.resids)
Conclusions:
Posterior predictive checks provide additional diagnostics about the fit of the model. Specifically, they provide a comparison between predictions drawn from the model and the observed data used to train the model.
available_ppc()
## bayesplot PPC module:
## ppc_bars
## ppc_bars_grouped
## ppc_boxplot
## ppc_dens
## ppc_dens_overlay
## ppc_dens_overlay_grouped
## ppc_ecdf_overlay
## ppc_ecdf_overlay_grouped
## ppc_error_binned
## ppc_error_hist
## ppc_error_hist_grouped
## ppc_error_scatter
## ppc_error_scatter_avg
## ppc_error_scatter_avg_grouped
## ppc_error_scatter_avg_vs_x
## ppc_freqpoly
## ppc_freqpoly_grouped
## ppc_hist
## ppc_intervals
## ppc_intervals_grouped
## ppc_km_overlay
## ppc_km_overlay_grouped
## ppc_loo_intervals
## ppc_loo_pit
## ppc_loo_pit_overlay
## ppc_loo_pit_qq
## ppc_loo_ribbon
## ppc_ribbon
## ppc_ribbon_grouped
## ppc_rootogram
## ppc_scatter
## ppc_scatter_avg
## ppc_scatter_avg_grouped
## ppc_stat
## ppc_stat_2d
## ppc_stat_freqpoly
## ppc_stat_freqpoly_grouped
## ppc_stat_grouped
## ppc_violin_grouped
polis.brm3 %>% pp_check(type = "dens_overlay", ndraws = 100)
The model draws appear to be consistent with the observed data.
polis.brm3 %>% pp_check(type = "error_scatter_avg")
For a binary response, this plot is not really interpretable.
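As with the rstanarm version, a binned error plot is usually more informative for a 0/1 response. A sketch, assuming the fitted polis.brm3 object:

```r
## Binned average prediction errors -- better suited to 0/1 responses
polis.brm3 %>% pp_check(type = "error_binned", ndraws = 9)
```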
polis.brm3 %>% pp_check(x = "RATIO", type = "error_scatter_avg_vs_x")
polis.brm3 %>% pp_check(x = "RATIO", type = "intervals")
polis.brm3 %>% pp_check(x = "RATIO", type = "ribbon")
The shinystan package allows the full suite of MCMC
diagnostics and posterior predictive checks to be accessed via a web
interface.
# library(shinystan)
# launch_shinystan(polis.brm3)
DHARMa residuals provide very useful diagnostics. Unfortunately, we
cannot directly use the simulateResiduals() function to
generate the simulated residuals. However, if we are willing to
calculate some of the components ourselves, we can still obtain the
simulated residuals from the fitted stan model.
We need to supply: the simulated (posterior predicted) responses, the observed responses, and the fitted (predicted) response for each observation.
preds <- polis.brm3 %>% posterior_predict(ndraws = 250, summary = FALSE)
polis.resids <- createDHARMa(
simulatedResponse = t(preds),
observedResponse = polis$PA,
fittedPredictedResponse = apply(preds, 2, median),
integerResponse = TRUE
)
polis.resids %>% plot()
Conclusions:
polis.rstanarm3 %>%
ggpredict() %>%
plot(add.data = TRUE, jitter = FALSE)
## $RATIO
polis.rstanarm3 %>%
ggemmeans(~RATIO) %>%
plot(add.data = TRUE)
polis.rstanarm3 %>%
fitted_draws(newdata = polis) %>%
median_hdci() %>%
ggplot(aes(x = RATIO, y = .value)) +
geom_ribbon(aes(ymin = .lower, ymax = .upper), fill = "blue", alpha = 0.3) +
geom_line()
polis.brm3 %>%
conditional_effects() %>%
plot(points = TRUE)
polis.brm3 %>%
conditional_effects(spaghetti = TRUE, nsamples = 500) %>%
plot(points = TRUE)
polis.brm3 %>%
ggpredict() %>%
plot(add.data = TRUE, jitter = FALSE)
## $RATIO
polis.brm3 %>%
ggemmeans(~RATIO) %>%
plot(add.data = TRUE)
polis.brm3 %>%
fitted_draws(newdata = polis) %>%
median_hdci() %>%
ggplot(aes(x = RATIO, y = .value)) +
geom_ribbon(aes(ymin = .lower, ymax = .upper), fill = "blue", alpha = 0.3) +
geom_line()
partial.obs <- polis %>%
mutate(
fit = fitted(polis.brm3, newdata = polis)[, "Estimate"],
resid = resid(polis.brm3)[, "Estimate"],
Obs = fit + resid
)
polis.brm3 %>%
fitted_draws(newdata = polis) %>%
median_hdci() %>%
ggplot(aes(x = RATIO, y = .value)) +
geom_ribbon(aes(ymin = .lower, ymax = .upper), fill = "blue", alpha = 0.3) +
geom_point(data = partial.obs, aes(y = Obs)) +
geom_line()
rstanarm captures the MCMC samples from
stan within the returned list. There are numerous ways to
retrieve and summarise these samples. The first three provide convenient
numeric summaries from which you can draw conclusions, the last four
provide ways of obtaining the full posteriors.
The summary() method generates simple summaries (mean and
standard deviation, as well as the 10, 50 and 90 percentiles).
summary(polis.rstanarm3)
##
## Model Info:
## function: stan_glm
## family: binomial [logit]
## formula: PA ~ RATIO
## algorithm: sampling
## sample: 2400 (posterior sample size)
## priors: see help('prior_summary')
## observations: 19
## predictors: 2
##
## Estimates:
## mean sd 10% 50% 90%
## (Intercept) 3.6 1.5 1.8 3.4 5.5
## RATIO -0.2 0.1 -0.3 -0.2 -0.1
##
## Fit Diagnostics:
## mean sd 10% 50% 90%
## mean_PPD 0.5 0.1 0.4 0.5 0.7
##
## The mean_ppd is the sample average posterior predictive distribution of the outcome variable (for details see help('summary.stanreg')).
##
## MCMC diagnostics
## mcse Rhat n_eff
## (Intercept) 0.0 1.0 2112
## RATIO 0.0 1.0 2129
## mean_PPD 0.0 1.0 2381
## log-posterior 0.0 1.0 2347
##
## For each parameter, mcse is Monte Carlo standard error, n_eff is a crude measure of effective sample size, and Rhat is the potential scale reduction factor on split chains (at convergence Rhat=1).
Conclusions:
polis.rstanarm3$stanfit %>%
summarise_draws(
median,
HDInterval::hdi,
rhat, length, ess_bulk, ess_tail
)
We can also alter the CI level.
polis.rstanarm3$stanfit %>%
summarise_draws(
median,
~ HDInterval::hdi(.x, credMass = 0.9),
rhat, length, ess_bulk, ess_tail
)
tidyMCMC(polis.rstanarm3$stanfit,
estimate.method = "median", conf.int = TRUE,
conf.method = "HPDinterval", rhat = TRUE, ess = TRUE
)
Conclusions:
polis.rstanarm3$stanfit %>% as_draws_df()
## summarised
polis.rstanarm3$stanfit %>%
as_draws_df() %>%
summarise_draws(
"median",
~ HDInterval::hdi(.x),
"rhat",
"ess_bulk"
)
## summarised on the odds (ratio) scale
polis.rstanarm3$stanfit %>%
as_draws_df() %>%
mutate(across(everything(), exp)) %>%
summarise_draws(
"median",
~ HDInterval::hdi(.x),
"rhat",
"ess_bulk"
)
polis.draw <- polis.rstanarm3 %>% gather_draws(`(Intercept)`, RATIO)
## OR via regex
polis.draw <- polis.rstanarm3 %>% gather_draws(`.Intercept.*|RATIO.*`, regex = TRUE)
polis.draw
We can then summarise this:
polis.draw %>% median_hdci()
polis.rstanarm3 %>%
gather_draws(`(Intercept)`, RATIO) %>%
ggplot() +
stat_halfeye(aes(x = .value, y = .variable)) +
facet_wrap(~.variable, scales = "free")
We could alternatively express the parameters on the odds (odds ratio) scale.
polis.rstanarm3 %>%
gather_draws(`(Intercept)`, RATIO) %>%
group_by(.variable) %>%
mutate(.value = exp(.value)) %>%
median_hdci()
Conclusions:
polis.rstanarm3 %>% plot(plotfun = "mcmc_intervals")
This is purely a graphical depiction of the posteriors.
polis.rstanarm3 %>% tidy_draws()
polis.rstanarm3 %>% spread_draws(`(Intercept)`, RATIO)
# OR via regex
polis.rstanarm3 %>% spread_draws(`.Intercept.*|RATIO.*`, regex = TRUE)
## summarised
polis.rstanarm3 %>%
spread_draws(`(Intercept)`, RATIO) %>%
summarise_draws(
"median",
~ HDInterval::hdi(.x),
"rhat",
"ess_bulk"
)
## summarised on the odds (ratio) scale
polis.rstanarm3 %>%
spread_draws(`(Intercept)`, RATIO) %>%
mutate(across(everything(), exp)) %>%
summarise_draws(
"median",
~ HDInterval::hdi(.x),
"rhat",
"ess_bulk"
)
polis.rstanarm3 %>%
posterior_samples() %>%
as_tibble()
polis.rstanarm3 %>%
bayes_R2() %>%
median_hdci()
brms captures the MCMC samples from stan
within the returned list. There are numerous ways to retrieve and
summarise these samples. The first three provide convenient numeric
summaries from which you can draw conclusions, the last four provide
ways of obtaining the full posteriors.
The summary() method generates simple summaries (posterior mean,
standard deviation and 95% credible intervals, along with Rhat and effective sample size diagnostics).
polis.brm3 %>% summary()
## Family: binomial
## Links: mu = logit
## Formula: PA | trials(1) ~ RATIO
## Data: polis (Number of observations: 19)
## Draws: 3 chains, each with iter = 5000; warmup = 1000; thin = 5;
## total post-warmup draws = 2400
##
## Population-Level Effects:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## Intercept 2.51 1.06 0.69 4.70 1.00 2228 2368
## RATIO -0.15 0.06 -0.27 -0.05 1.00 2301 2174
##
## Draws were sampled using sample(hmc). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
Conclusions:
polis.brm3$fit %>% tidyMCMC(estimate.method = "median", conf.int = TRUE, conf.method = "HPDinterval", rhat = TRUE, ess = TRUE)
Conclusions:
polis.brm3 %>%
summarise_draws(
median,
HDInterval::hdi,
rhat, length, ess_bulk, ess_tail
)
We can also alter the CI level.
polis.brm3 %>%
summarise_draws(
median,
~ HDInterval::hdi(.x, credMass = 0.9),
rhat, length, ess_bulk, ess_tail
)
To narrow down to just the parameters of interest, see the code under the tidy_draws tab.
polis.brm3 %>% as_draws_df()
## summarised
polis.brm3 %>%
as_draws_df() %>%
summarise_draws(
"median",
~ HDInterval::hdi(.x),
"rhat",
"ess_bulk"
)
## summarised on the odds (ratio) scale
polis.brm3 %>%
as_draws_df() %>%
dplyr::select(starts_with("b_")) %>%
mutate(across(everything(), exp)) %>%
summarise_draws(
"median",
~ HDInterval::hdi(.x),
"rhat",
"ess_bulk"
)
polis.draw <- polis.brm3 %>% gather_draws(b_Intercept, b_RATIO)
## OR via regex
polis.draw <- polis.brm3 %>% gather_draws(`b_.*`, regex = TRUE)
polis.draw
We can then summarise this:
polis.draw %>% median_hdci()
polis.brm3 %>%
gather_draws(b_Intercept, b_RATIO) %>%
ggplot() +
stat_halfeye(aes(x = .value, y = .variable)) +
facet_wrap(~.variable, scales = "free")
polis.draw %>%
ggplot() +
stat_halfeye(aes(
x = .value, y = .variable,
fill = stat(ggdist::cut_cdf_qi(cdf,
.width = c(0.5, 0.8, 0.95),
labels = scales::percent_format()
))
)) +
scale_fill_brewer("Interval", direction = -1, na.translate = FALSE) +
facet_wrap(~.variable, scales = "free") +
theme_bw()
## polis.draw %>%
## ggplot() +
## stat_halfeye(aes(x = .value, y = .variable,
## fill = stat(ggdist::cut_cdf_qi(cdf,
## .width = c(0.5, 0.8, 0.95),
## labels = scales::percent_format())))) +
## scale_fill_brewer('Interval', direction = -1, na.translate = FALSE) +
## theme_bw()
We could alternatively express the parameters on the odds (odds ratio) scale.
polis.brm3 %>%
gather_draws(b_Intercept, b_RATIO) %>%
group_by(.variable) %>%
mutate(.value = exp(.value)) %>%
median_hdci()
Conclusions:
polis.brm3 %>% mcmc_plot(type = "intervals")
This is purely a graphical depiction of the posteriors.
polis.brm3 %>% tidy_draws()
polis.brm3 %>% spread_draws(b_Intercept, b_RATIO)
# OR via regex
polis.brm3 %>% spread_draws(`b_.*`, regex = TRUE)
## summarised
polis.brm3 %>%
as_draws_df() %>%
dplyr::select(starts_with("b_")) %>%
summarise_draws(
"median",
~ HDInterval::hdi(.x),
"rhat",
"ess_bulk"
)
## summarised on the odds (ratio) scale
polis.brm3 %>%
as_draws_df() %>%
dplyr::select(starts_with("b_")) %>%
mutate(across(everything(), exp)) %>%
summarise_draws(
"median",
~ HDInterval::hdi(.x),
"rhat",
"ess_bulk"
)
polis.brm3 %>%
tidy_draws() %>%
exp() %>%
summarise_draws(median, HDInterval::hdi, rhat, ess_bulk, ess_tail) %>%
filter(variable %in% c("b_Intercept", "b_RATIO"))
polis.brm3 %>%
tidy_draws() %>%
exp() %>%
dplyr::select(starts_with("b_")) %>%
summarise_draws(median, HDInterval::hdi, rhat, ess_bulk, ess_tail)
The above is on an odds-ratio scale.
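As a quick numerical illustration, back-transforming the slope point estimate reported in the earlier summary (approximately -0.15):

```r
## Multiplicative change in the odds of presence per unit increase in
## perimeter:area ratio (slope point estimate from the summary above)
exp(-0.15)
## ~0.86, i.e. roughly a 14% decline in the odds per unit of RATIO
```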
polis.brm3 %>%
posterior_samples() %>%
as_tibble()
polis.brm3 %>%
bayes_R2()
## Estimate Est.Error Q2.5 Q97.5
## R2 0.3917528 0.1069452 0.1371091 0.5470194
## OR as median and hdci
polis.brm3 %>%
bayes_R2(summary = FALSE) %>%
median_hdci()
In some disciplines it is useful to be able to calculate an LD50. This is the value along the x-axis that corresponds to a probability of 50% - here, the switch-over point in island perimeter to area ratio at which the lizards go from more likely to be present to more likely to be absent. It is the inflection point of the logistic curve.
It is also the point at which the slope (when back-transformed) is at its steepest and can be calculated as:
\[ -\frac{\text{Intercept}}{\text{Slope}} \]
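As a rough illustration using the point estimates from the earlier brms summary (intercept ≈ 2.51, slope ≈ -0.15):

```r
## Point estimate of the LD50; the draws-based calculations that follow
## propagate the full posterior uncertainty
-2.51 / -0.15
## ~16.7 -- islands with a perimeter:area ratio above this value are
## more likely to lack Uta lizards
```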
polis.rstanarm3 %>%
tidy_draws() %>%
mutate(LD50 = -1 * `(Intercept)` / RATIO) %>%
pull(LD50) %>%
median_hdci()
polis.brm3 %>%
tidy_draws() %>%
mutate(LD50 = -1 * b_Intercept / b_RATIO) %>%
pull(LD50) %>%
median_hdci()
## Using emmeans
polis.grid <- with(polis, list(RATIO = seq(min(RATIO), max(RATIO), len = 100)))
newdata <- emmeans(polis.rstanarm3, ~RATIO, at = polis.grid, type = "response") %>% as.data.frame()
head(newdata)
ggplot(newdata, aes(y = prob, x = RATIO)) +
geom_point(data = polis, aes(y = PA)) +
geom_line() +
geom_ribbon(aes(ymin = lower.HPD, ymax = upper.HPD), fill = "blue", alpha = 0.3) +
scale_y_continuous("PA") +
scale_x_continuous("RATIO") +
theme_classic()
## spaghetti plot
newdata <- emmeans(polis.rstanarm3, ~RATIO, at = polis.grid) %>%
gather_emmeans_draws() %>%
mutate(.value = plogis(.value))
newdata %>% head()
ggplot(newdata, aes(y = .value, x = RATIO)) +
geom_line(aes(group = .draw), alpha = 0.01) +
geom_point(data = polis, aes(y = PA))
## Using emmeans
polis.grid <- with(polis, list(RATIO = seq(min(RATIO), max(RATIO), len = 100)))
newdata <- emmeans(polis.rstanarm3, ~RATIO, at = polis.grid, type = "response") %>% as.data.frame()
head(newdata)
ggplot(newdata, aes(y = prob, x = RATIO)) +
geom_point(data = polis, aes(y = PA)) +
geom_line() +
geom_ribbon(aes(ymin = lower.HPD, ymax = upper.HPD), fill = "blue", alpha = 0.3) +
scale_y_continuous(expression(Presence / absence ~ of ~ italic(Uta) ~ lizards)) +
scale_x_continuous(expression(Island ~ perimeter:Area ~ ratio)) +
theme_classic()
## spaghetti plot
newdata <- emmeans(polis.rstanarm3, ~RATIO, at = polis.grid) %>%
gather_emmeans_draws() %>%
mutate(.value = plogis(.value))
newdata %>% head()
ggplot(newdata, aes(y = .value, x = RATIO)) +
geom_line(aes(group = .draw), alpha = 0.01) +
geom_point(data = polis, aes(y = PA)) +
scale_y_continuous(expression(Presence / absence ~ of ~ italic(Uta) ~ lizards)) +
scale_x_continuous(expression(Island ~ perimeter:Area ~ ratio)) +
theme_classic()
## Using emmeans
polis.grid <- with(polis, list(RATIO = modelr::seq_range(RATIO, n = 100)))
newdata <- polis.brm3 %>%
emmeans(~RATIO, at = polis.grid, type = "response") %>%
as.data.frame()
head(newdata)
## Using raw data for points
newdata %>%
ggplot(aes(y = prob, x = RATIO)) +
geom_point(data = polis, aes(y = PA)) +
geom_line() +
geom_ribbon(aes(ymin = lower.HPD, ymax = upper.HPD), fill = "blue", alpha = 0.3) +
scale_y_continuous("PA") +
scale_x_continuous("RATIO") +
theme_classic()
## Using partial residuals for points
partial.obs <- polis %>%
bind_cols(
Pred = predict(polis.brm3)[, "Estimate"],
Resid = residuals(polis.brm3)[, "Estimate"]
) %>%
mutate(
Obs = round(Pred + Resid, 0)
)
newdata %>%
ggplot(aes(y = prob, x = RATIO)) +
geom_point(data = partial.obs, aes(y = Obs)) +
geom_line() +
geom_ribbon(aes(ymin = lower.HPD, ymax = upper.HPD), fill = "blue", alpha = 0.3) +
scale_y_continuous("PA") +
scale_x_continuous("RATIO") +
theme_classic()
## spaghetti plot
newdata <- emmeans(polis.brm3, ~RATIO, at = polis.grid) %>%
gather_emmeans_draws() %>%
mutate(.value = plogis(.value))
newdata %>% head()
ggplot(newdata, aes(y = .value, x = RATIO)) +
geom_line(aes(group = .draw), alpha = 0.01) +
geom_point(data = polis, aes(y = PA))
## Using emmeans
polis.grid <- with(polis, list(RATIO = seq(min(RATIO), max(RATIO), len = 100)))
newdata <- emmeans(polis.brm3, ~RATIO, at = polis.grid, type = "response") %>% as.data.frame()
head(newdata)
ggplot(newdata, aes(y = prob, x = RATIO)) +
geom_point(data = polis, aes(y = PA)) +
geom_line() +
geom_ribbon(aes(ymin = lower.HPD, ymax = upper.HPD), fill = "blue", alpha = 0.3) +
scale_y_continuous(expression(Presence / absence ~ of ~ italic(Uta) ~ lizards)) +
scale_x_continuous(expression(Island ~ perimeter:Area ~ ratio)) +
theme_classic()
## spaghetti plot
newdata <- emmeans(polis.brm3, ~RATIO, at = polis.grid) %>%
gather_emmeans_draws() %>%
mutate(.value = plogis(.value))
newdata %>% head()
ggplot(newdata, aes(y = .value, x = RATIO)) +
geom_line(aes(group = .draw), alpha = 0.01) +
geom_point(data = polis, aes(y = PA)) +
scale_y_continuous(expression(Presence / absence ~ of ~ italic(Uta) ~ lizards)) +
scale_x_continuous(expression(Island ~ perimeter:Area ~ ratio)) +
theme_classic()